Reviews: Superposition of many models into one
I appreciate that the authors re-implemented the CIFAR benchmark I had requested. However, I remain unconvinced of the significance and originality of the proposed approach. For me, two fundamental issues remain: 1) The proposed approach is conceptually very similar to the masking proposed in Masse et al. (2018) that I mentioned in my review. The only difference is essentially masking with a {1, -1} vector rather than a {0, 1} vector. For sufficiently sparse masks (as used in Masse et al.), the latter approach also produces largely non-overlapping feature subsets for different tasks, so I don't see this as a major difference.
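The reviewer's claim that sparse {0, 1} masks already yield largely non-overlapping feature subsets can be checked numerically. The sketch below is illustrative only (the dimension and sparsity level are made up, and this is not the code of either paper): with a keep fraction of p per task, two independent sparse masks share only about p² of the units.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1000        # feature dimension (illustrative)
n_tasks = 2

# Dense sign masking, as in the superposition approach: {+1, -1} context vectors.
sign_masks = rng.choice([-1, 1], size=(n_tasks, d))

# Sparse binary gating in the spirit of Masse et al. (2018): {0, 1} masks
# that keep only a small fraction of units per task.
keep_frac = 0.1
binary_masks = (rng.random((n_tasks, d)) < keep_frac).astype(int)

# With independent sparse masks, the expected fraction of units active
# for BOTH tasks is keep_frac**2, i.e. the subsets are largely disjoint.
overlap = np.mean(binary_masks[0] * binary_masks[1])
print(f"fraction of units active for both tasks: {overlap:.3f}")
```

For keep_frac = 0.1 the measured overlap lands near 0.01, supporting the reviewer's point that sparsity alone already separates task-specific feature subsets.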
A Review of Deep Reinforcement Learning in Atari: Benchmarks, Challenges, and Solutions
The Arcade Learning Environment (ALE) was proposed as an evaluation platform for empirically assessing the generality of agents across dozens of Atari 2600 games. ALE offers a variety of challenging problems and has drawn significant attention from the deep reinforcement learning (RL) community. From Deep Q-Networks (DQN) to Agent57, RL agents appear to achieve superhuman performance in ALE. But is this really the case? In this paper, we first review the current evaluation metrics in the Atari benchmarks and then show that the prevailing criterion for superhuman performance is inappropriate, because it underestimates human performance relative to what is possible. To address these problems and promote the development of RL research, we propose a novel Atari benchmark based on human world records (HWR), which places higher demands on RL agents in terms of both final performance and learning efficiency. Furthermore, we summarize the state-of-the-art (SOTA) methods on the Atari benchmarks and report benchmark results under the new HWR-based evaluation metrics. From these results, we conclude that at least four open challenges hinder RL agents from achieving superhuman performance. Finally, we discuss some promising directions for addressing these challenges.
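A benchmark "based on human world records" suggests normalizing agent scores against the record rather than the average-human baseline used in earlier Atari evaluations. The sketch below shows the standard normalization formula with the world record substituted as the human reference; the formula shape follows common practice in the Atari literature, and all numbers are illustrative, not taken from the paper.

```python
def hwr_normalized(agent_score, random_score, world_record):
    """Normalized score: 0.0 = random play, 1.0 = human world record.

    Same shape as the usual human-normalized score, but with the
    human world record as the reference instead of an average human.
    """
    return (agent_score - random_score) / (world_record - random_score)

# Illustrative numbers only (not actual Atari scores):
print(hwr_normalized(agent_score=8000, random_score=200, world_record=40000))
```

Because world records far exceed average human play, an agent that looks "superhuman" under the old baseline can score well below 1.0 under this stricter metric.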
Now DeepMind's New AI Agent Outperforms Humans
Recently, a team of researchers from DeepMind, Google Brain and the University of Toronto unveiled a new reinforcement learning agent known as DreamerV2, which learns behaviours purely from predictions in the compact latent space of a powerful world model. According to the researchers, DreamerV2 is the first agent based on a world model to achieve human-level performance on the Atari benchmark. From driverless cars to beating Go world champions, reinforcement learning has come a long way.
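Learning "purely from predictions in the compact latent space of a world model" means the policy is trained on imagined rollouts of the learned dynamics, without extra environment steps. The toy sketch below illustrates that idea with made-up linear components; it is not DreamerV2's actual architecture (which uses a discrete-latent recurrent state-space model and trained networks for each part).

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, action_dim, horizon = 8, 2, 5

# Stand-ins for learned components (random here; in a real world-model
# agent these are trained networks: latent dynamics, reward head, policy).
W_dyn = rng.normal(scale=0.3, size=(latent_dim, latent_dim + action_dim))
w_reward = rng.normal(size=latent_dim)
W_policy = rng.normal(size=(action_dim, latent_dim))

def imagine(z0, horizon):
    """Roll the world model forward in latent space: no environment steps."""
    z, total_reward = z0, 0.0
    for _ in range(horizon):
        a = np.tanh(W_policy @ z)                    # policy acts on latent state
        z = np.tanh(W_dyn @ np.concatenate([z, a]))  # predicted next latent state
        total_reward += w_reward @ z                 # predicted reward
    return total_reward

z0 = rng.normal(size=latent_dim)  # would come from encoding a real observation
print(f"imagined return over {horizon} steps: {imagine(z0, horizon):.3f}")
```

The key point is that `imagine` touches only the model's own predictions, which is what makes this family of agents far more sample-efficient on the real environment.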